Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PeepholePermalink.Cache #10042

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jglick
Copy link
Member

@jglick jglick commented Dec 10, 2024

Jenkins persistently caches permalink values such as lastUnstableBuild to a text file $JENKINS_HOME/job/$name/builds/permalinks, with the format

lastUnstableBuild 123
lastFailedBuild -1

where -1 indicates a negative cache. This was implemented because lazy loading of builds means we wish to avoid loading too many historical build records into memory. If you are so lucky as to have had thousands of builds without a single failure, without a cache the controller would need to search them all after every startup just to find that there has never been an unstable build. So the cache is read into memory at most once per job, and written whenever some permalink changes (typically after a build completes).

In CloudBees CI running in high availability mode this cache implementation was problematic and needed to be replaced with something appropriate for concurrent access. Originally I had tried invalidating the in-memory cache and letting it be reread from disk, but this was a bit inefficient and still vulnerable to race conditions. A more satisfactory solution is to extract the singleton implementation into a beta extension point, similarly to #9846.

The behavior for Jenkins should be identical, but there may be an ancillary benefit that (in my opinion) the refactored logic is easier to read and reason about. In the original code, an int variable was overloaded to mean a build number (when >0), a negative cache result when -1, or no cache result when 0. There actually appears to have been a mistake in the logic (albeit a harmless one) caused by treating 0 as if it were a legitimate build number. This seemed like a great opportunity to practice using sealed types as unions to catch such mistakes in the compiler. Additionally, all code relating to the file format (and the rather arbitrary choice of -1 to mean a negative cache, rather than say - or null) is encapsulated in the default cache implementation.

Testing done

The existing PeepholePermalinkTest (in the test module) should cover the existing logic. The viability of an alternate extension point implementation is covered by a test in CloudBees CI.

Proposed changelog entries

  • N/A

Proposed upgrade guidelines

N/A

Desired reviewers

Before the changes are marked as ready-for-merge:

Maintainer checklist

@jglick jglick requested a review from Vlatombe December 10, 2024 15:29
@@ -853,7 +853,7 @@ public RunT getBuildForCLI(@Argument(required = true, metaVar = "BUILD#", usage
}

/**
* Gets the youngest build #m that satisfies {@code n<=m}.
* Gets the oldest build #m that satisfies {@code m ≥ n}.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getNearestBuild is not used by the code touched here, but I noticed that its Javadoc was simply wrong: it finds the oldest build newer than a given point, not the youngest build!

Also the comparison was phrased in the opposite direction of what you might intuitively expect.

@@ -977,7 +977,7 @@ public File getBuildDir() {
protected abstract void removeRun(RunT run);

/**
* Returns the last build.
* Returns the newest build.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than repeating the terminology used in the method name, trying to clarify.

@@ -868,7 +868,7 @@ public RunT getNearestBuild(int n) {
}

/**
* Gets the latest build #m that satisfies {@code m<=n}.
* Gets the newest build #m that satisfies {@code mn}.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguably clearer phrasing. Encountered while studying the behavior of getNearestOldBuild(0) (see elsewhere).

assertThat(p.getNearestBuild(6), nullValue());
assertThat(p.getNearestBuild(7), nullValue());
assertThat(p.getNearestOldBuild(-1), nullValue());
assertThat(p.getNearestOldBuild(0), nullValue());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original implementation in fact called getNearestOldBuild(0), which unconditionally returns null. You can deduce this from the original Javadoc but the default implementation (overridden in LazyBuildMixIn) based on TailMap was potentially confusing, since the map in question is sorted in reverse order. I noticed there was no direct test coverage of these methods and decided to add some to clarify the meaning.

}

// the cache is stale. start the search
if (b == null) {
b = job.getNearestOldBuild(n);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me a while to realize that this was sometimes called with n == 0 in which case the return value of getNearestOldBuild was unconditionally null, so the clause had no effect (b == null before and after). Now this method is called only when processing Some, with a positive argument.

Comment on lines -136 to -138
//noinspection StatementWithEmptyBody
for ( ; b != null && !apply(b); b = b.getPreviousBuild())
;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Git diff unfortunately fails to associate this hunk with the original method call. I just converted the body into a while loop for clarity.

Comment on lines +234 to +235
var cached = cache.get(id);
return cached != null ? cached : UNKNOWN;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that Map.getOrDefault cannot be used because it is not typed to permit a return value wider than the map value type.

for (var entry : cache.entrySet()) {
cw.write(entry.getKey());
cw.write(' ');
cw.write(Integer.toString(entry.getValue() instanceof Cache.Some some ? some.number : -1));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this can be written more clearly in Java 21:

--- core/src/main/java/jenkins/model/PeepholePermalink.java
+++ core/src/main/java/jenkins/model/PeepholePermalink.java
@@ -248,7 +248,10 @@ public abstract class PeepholePermalink extends Permalink implements Predicate<R
                         for (var entry : cache.entrySet()) {
                             cw.write(entry.getKey());
                             cw.write(' ');
-                            cw.write(Integer.toString(entry.getValue() instanceof Cache.Some some ? some.number : -1));
+                            cw.write(Integer.toString(switch (entry.getValue()) {
+                                case Some some -> some.number;
+                                case None none -> -1;
+                            }));
                             cw.write('\n');
                         }
                         cw.commit();

Run<?, ?> resolve(@NonNull PeepholePermalink pp, @NonNull Job<?, ?> job, @NonNull String id);

/**
* Partial implementation of {@link #resolve(PeepholePermalink, Job, String)} when searching.
Copy link
Member Author

@jglick jglick Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(broken out of the original resolve method; the rest is mostly now in Some)

@timja timja added the skip-changelog Should not be shown in the changelog label Dec 11, 2024
Copy link
Member

@timja timja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/label ready-for-merge


This PR is now ready for merge, after ~24 hours, we will merge it if there's no negative feedback.

Thanks!

@comment-ops-bot comment-ops-bot bot added the ready-for-merge The PR is ready to go, and it will be merged soon if there is no negative feedback label Dec 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-for-merge The PR is ready to go, and it will be merged soon if there is no negative feedback skip-changelog Should not be shown in the changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants